
FIX: Avoid loss of precision when casting in packet loading (alternative) #786

Merged

Conversation

greglucas
Collaborator

Change Summary

Overview

When using derived values, there can be situations where a linear conversion factor is applied to a uint8 value, for instance to turn a raw measurement into a float temperature. These values are represented as a small uint datatype onboard, but need to be represented as a float or a larger integer datatype on the ground so that we don't lose precision. Previously, 2.1 was being cast to 2 because the derived values were cast back to their onboard types.

closes #780
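The failure mode can be reproduced with plain NumPy. This is a minimal sketch of the idea, not the package's internals; `minimum_dtype_array` here is an illustrative stand-in for the PR's `_create_minimum_dtype_array` helper:

```python
import numpy as np

# Onboard, the raw value is a uint8; a linear conversion (e.g. x * 0.1)
# produces a derived float on the ground.
raw = np.asarray([21], dtype=np.uint8)
derived = raw * 0.1  # float64 array containing 2.1

# Casting the derived value back to the onboard dtype truncates it: 2.1 -> 2.
truncated = derived.astype(np.uint8)
assert truncated[0] == 2  # precision lost

def minimum_dtype_array(values):
    """Illustrative sketch: keep floats as floats, shrink ints safely."""
    arr = np.asarray(values)
    if np.issubdtype(arr.dtype, np.floating):
        return arr  # don't force float data back into an integer dtype
    # smallest integer dtype that can hold both extremes of the data
    small = np.result_type(np.min_scalar_type(arr.min()),
                           np.min_scalar_type(arr.max()))
    return arr.astype(small)

assert minimum_dtype_array(derived.tolist())[0] == 2.1  # precision preserved
```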

Contributor

@subagonsouth subagonsouth left a comment


Looks good to me. Just one question about 64 vs 32 bit floats.

```diff
- np.asarray(list_of_values, dtype=datatype_mapping[apid][key]),
+ _create_minimum_dtype_array(
+     list_of_values, dtype=datatype_mapping[apid][key]
+ ),
```
Contributor


Does numpy default to 64-bit floats? And if so, do we have any way to get the CDF products to use 32-bit floats where we want them?

Collaborator Author


Yes and yes.

I am actually getting 32-bit floats in another packet definition where the floats are brought down directly in the telemetry stream: space_packet_parser emits the Python floats, and then we can use the declared bit width to choose between 32 and 64 bits correctly:

```python
elif isinstance(data_encoding, xtcedef.FloatDataEncoding):
    datatype = "float"
    if nbits == 32:
        datatype += "32"
    else:
        datatype += "64"
```
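As a quick sanity check on the default behavior the question raises (a minimal sketch, not project code): NumPy does default to 64-bit floats for Python float inputs, and a 32-bit dtype has to be requested explicitly.

```python
import numpy as np

# Python floats land as float64 unless a dtype is specified.
assert np.asarray([2.1]).dtype == np.float64

# Where the telemetry declares nbits == 32, the cast can be made explicit.
arr32 = np.asarray([2.1], dtype=np.float32)
assert arr32.dtype == np.float32
assert arr32.itemsize == 4  # 32 bits
```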

This situation arises purely when a conversion factor gets applied: the packet definition tells me the field is unsigned 8 bits, but once it has been multiplied by 2.5 there isn't much I can infer from the definition about what you want it turned into (a float, a wider int, a string enumeration, ...), so all bets are off.

Ultimately, though, this gets to your earlier suggestion that this is really just a "first cut": we should be putting the dtype into the CDF metadata attribute yaml definitions and doing the casting there when creating the datasets and saving the products.
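One possible shape for that yaml-driven casting, sketched under assumptions: the `cdf_attrs` dictionary stands in for parsed yaml, and the variable names, `dtype` key, and `cast_for_cdf` helper are all hypothetical, not existing project code.

```python
import numpy as np

# Hypothetical per-variable dtype entries, as they might appear in the
# CDF metadata attribute yaml once dtype lives there.
cdf_attrs = {
    "temperature": {"dtype": "float32"},
    "counter": {"dtype": "uint16"},
}

def cast_for_cdf(name, values, attrs=cdf_attrs):
    """Cast values to the dtype declared in the metadata definitions."""
    return np.asarray(values, dtype=attrs[name]["dtype"])

# The derived float keeps 32-bit precision; the counter stays integral.
assert cast_for_cdf("temperature", [2.1]).dtype == np.float32
assert cast_for_cdf("counter", [5]).dtype == np.uint16
```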

@greglucas greglucas merged commit 164aca0 into IMAP-Science-Operations-Center:dev Aug 30, 2024
17 checks passed
@greglucas greglucas deleted the min-dtype-fix-xtce branch August 30, 2024 13:55
Labels
packet parsing Related to packet parsing or XTCE
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG - Reading in derived float values loses precision
2 participants